Protein Structure Alignment through a Contact Topology Profile using SABERTOOTH
Authors
Abstract
The contact vector (CV) of a protein structure is one of the simplest and most condensed descriptions of protein structure available. It lists the number of contacts each amino acid has with the surrounding structure and has frequently been used, e.g., to derive approximate folding energies in protein folding analysis. The CV, however, is a lossy structure representation, as it does not contain sufficient information to allow for the reconstruction of the full protein structure it was derived from. The loss of information leads to a degeneracy in the sense that a single contact vector is compatible with many different contact matrices, but it has been shown that this degeneracy is nearly fully compensated by the physical constraints protein structure is subject to. We recently developed the alignment framework ‘SABERTOOTH’, which can generically align connectivity-related vectorial structure profiles to compute protein alignments. Here we show that the CV also allows for state-of-the-art alignment quality, just like the elaborate ‘Effective Connectivity’ profile (EC) that SABERTOOTH currently uses. This simplification leads to a very simple and elegant approach to structure alignment, which accelerates and generalizes the algorithm we previously proposed. Furthermore, we conclude from our work that the CV in itself is a useful structure description if its collective properties are called for.
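To make the definition concrete, the following is a minimal sketch of how a contact vector could be derived from residue coordinates, and of the degeneracy the abstract mentions. It is not the SABERTOOTH implementation: the function names, the 8 Å distance cutoff, the minimum sequence separation, and the toy coordinates are all assumptions chosen for illustration.

```python
import math

def contact_matrix(coords, cutoff=8.0, min_separation=3):
    """Binary contact matrix from one 3D point per residue.

    Residues i and j are in contact if they lie within `cutoff` of each
    other and are at least `min_separation` positions apart in the chain.
    Both thresholds are illustrative assumptions, not values from the paper.
    """
    n = len(coords)
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                matrix[i][j] = matrix[j][i] = 1
    return matrix

def contact_vector(matrix):
    """Contact vector: the number of contacts of each residue (row sums)."""
    return [sum(row) for row in matrix]

# Toy chain: five points on a straight line, 3.8 apart (hypothetical data).
toy_coords = [(3.8 * k, 0.0, 0.0) for k in range(5)]
toy_cv = contact_vector(contact_matrix(toy_coords, min_separation=1))

# Degeneracy: two different contact matrices that share one contact vector.
pairs_a = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]  # 0-1, 2-3
pairs_b = [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]]  # 0-2, 1-3
same_cv = contact_vector(pairs_a) == contact_vector(pairs_b)
```

In the toy chain, the middle point has the most neighbours within the cutoff, so its entry in `toy_cv` is the largest. The two small matrices pair the residues differently yet both sum to one contact per residue, which is the loss of information described above.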
Similar resources
A generalization of Profile Hidden Markov Model (PHMM) using one-by-one dependency between sequences
The Profile Hidden Markov Model (PHMM) can be poor at capturing dependency between observations because of the statistical assumptions it makes. To overcome this limitation, the dependency between residues in a multiple sequence alignment (MSA) which is the representative of a PHMM can be combined with the PHMM. Based on the fact that sequences appearing in the final MSA are written based on th...
Transmembrane Protein Alignment and Fold Recognition Based on Predicted Topology
BACKGROUND: Although Transmembrane Proteins (TMPs) are highly important in various biological processes and pharmaceutical developments, general prediction of TMP structures is still far from satisfactory. Because TMPs have significantly different physicochemical properties from soluble proteins, current protein structure prediction tools for soluble proteins may not work well for TMPs. With the...
High-accuracy prediction of transmembrane inter-helix contacts and application to GPCR 3D structure modeling
MOTIVATION: Residue-residue contacts across the transmembrane helices dictate the three-dimensional topology of alpha-helical membrane proteins. However, contact determination through experiments is difficult because most transmembrane proteins are hard to crystallize. RESULTS: We present a novel method (MemBrain) to derive transmembrane inter-helix contacts from amino acid sequences by combini...
A low-complexity add-on score for protein remote homology search with COMER.
Motivation: Protein sequence alignment forms the basis for comparative modeling, the most reliable approach to protein structure prediction, among many other applications. Alignment between sequence families, or profile-profile alignment, represents one of the most, if not the most, sensitive means for homology detection but still necessitates improvement. We aim at improving the quality of prof...
MSAT: a multiple sequence alignment tool based on TOPS.
This article describes the development of a new method for multiple sequence alignment based on fold-level protein structure alignments, which provides an improvement in accuracy compared with the most commonly used sequence-only-based techniques. This method integrates the widely used, progressive multiple sequence alignment approach ClustalW with the Topology of Protein Structure (TOPS) topol...